Model Generalization
Anonymous Learning via Look-Alike Clustering: A Precise Analysis of Model Generalization
While personalized recommendation systems have become increasingly popular, ensuring user data protection remains a top concern in the development of these learning systems. A common approach to enhancing privacy involves training models on anonymous data rather than individual data. In this paper, we explore a natural technique called look-alike clustering, which replaces the sensitive features of individuals with the cluster's average values. We provide a precise analysis of how training models on anonymous cluster centers affects their generalization capabilities. We focus on an asymptotic regime where the size of the training set grows in proportion to the feature dimension. Our analysis is based on the Convex Gaussian Minimax Theorem (CGMT) and allows us to theoretically understand the role of different model components in the generalization error. In addition, we demonstrate that in certain high-dimensional regimes, training over anonymous cluster centers acts as a regularizer and improves the generalization error of the trained models. Finally, we corroborate our asymptotic theory with finite-sample numerical experiments, where we observe a perfect match when the sample size is only on the order of a few hundred.
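The anonymization step described above can be sketched in a few lines: cluster the sensitive features, then hand each individual their cluster's mean instead of their own values. The sketch below uses plain k-means (Lloyd's algorithm) as the clustering step; the paper's exact clustering scheme and hyperparameters may differ.

```python
import numpy as np

def look_alike_anonymize(X_sensitive, n_clusters, rng=None):
    """Replace each row's sensitive features with its cluster's mean.

    A minimal sketch of look-alike clustering via k-means; all
    parameter choices here (iteration budget, init) are illustrative.
    """
    rng = np.random.default_rng(rng)
    n, d = X_sensitive.shape
    # initialize centers from random data points
    centers = X_sensitive[rng.choice(n, n_clusters, replace=False)].astype(float)
    for _ in range(50):  # fixed iteration budget for the sketch
        # assign each point to its nearest center
        labels = np.argmin(
            ((X_sensitive[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1
        )
        # recompute centers as cluster means
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = X_sensitive[labels == k].mean(axis=0)
    # each individual's sensitive features become the cluster average
    return centers[labels]
```

Note that the anonymized matrix preserves the global mean of the sensitive features while collapsing within-cluster variation, which is exactly the trade-off the paper's analysis quantifies.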
Efficient Generalization with Distributionally Robust Learning
Distributionally robust learning (DRL) is increasingly seen as a viable method for training machine learning models with improved generalization. These min-max formulations, however, are more difficult to solve. We provide a new stochastic gradient descent algorithm to efficiently solve this DRL formulation. Our approach applies gradient descent to the outer minimization and estimates the gradient of the inner maximization with a sample average approximation. The latter uses a subset of the data sampled without replacement in each iteration, progressively increasing the subset size to ensure convergence. We rigorously establish convergence to a near-optimal solution under standard regularity assumptions and, for strongly convex losses, match the best known $O(\epsilon^{-1})$ rate of convergence up to a known threshold. Empirical results demonstrate the significant benefits of our approach over previous work in improving learning for model generalization.
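The scheme above can be sketched on a toy logistic regression. This is not the paper's algorithm verbatim: as an illustrative stand-in for the inner maximization we use a KL-ball ambiguity set, whose worst-case distribution has a closed form (sample weights proportional to exp(loss / lambda)), estimated on a subset sampled without replacement whose size grows each round. All hyperparameter names and values are assumptions.

```python
import numpy as np

def drl_sgd(X, y, lam=1.0, lr=0.1, rounds=200, b0=8, growth=1.05, rng=0):
    """Sketch: outer gradient descent, inner max via a progressively
    larger sample-average approximation (labels y in {-1, +1})."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    w = np.zeros(d)
    batch = float(b0)
    for _ in range(rounds):
        m = min(n, int(batch))
        idx = rng.choice(n, size=m, replace=False)  # without replacement
        Xb, yb = X[idx], y[idx]
        margins = yb * (Xb @ w)
        losses = np.log1p(np.exp(-margins))         # logistic loss
        # inner maximization: exponentially tilted sample weights
        p = np.exp((losses - losses.max()) / lam)
        p /= p.sum()
        # gradient of the weighted (worst-case) empirical loss
        sig = 1.0 / (1.0 + np.exp(margins))
        grad = -(Xb * (p * sig * yb)[:, None]).sum(axis=0)
        w -= lr * grad
        batch *= growth  # progressively increase the subset size
    return w
```

The growing subset size mirrors the paper's convergence mechanism: early iterations use cheap, noisy inner-gradient estimates, later ones use nearly the full sample.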
IBMA: An Imputation-Based Mixup Augmentation Using Self-Supervised Learning for Time Series Data
Nguyen, Dang Nha, Nguyen, Hai Dang, Nguyen, Khoa Tho Anh
Data augmentation plays a crucial role in enhancing model performance across various AI fields by introducing variability while maintaining the underlying temporal patterns. However, for long-sequence time series data, where maintaining temporal consistency is critical, there are fewer augmentation strategies than in fields such as image or text, and advanced techniques like Mixup are rarely used. In this work, we propose a new approach, Imputation-based Mixup Augmentation (IMA), which combines imputed-data augmentation with Mixup augmentation to bolster model generalization and improve forecasting performance. We evaluate the effectiveness of this method across several forecasting models, including DLinear (MLP), TimesNet (CNN), and iTransformer (Transformer), which represent some of the most recent advances in long-sequence time series forecasting. Our experiments, conducted on three datasets (ETT-small, Illness, Exchange Rate) from various domains and compared against eight other augmentation techniques, demonstrate that IMA consistently enhances performance, achieving improvements in 22 of 24 instances, 10 of which are the best performances, particularly with iTransformer imputation on the ETT dataset. The GitHub repository is available at: https://github.com/dangnha/IMA.
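The core recipe (mask, impute, then mix the imputed series with the original) can be sketched for a single univariate series. The paper trains a self-supervised model for the imputation step; here linear interpolation stands in for it, and the Beta-distributed mixing coefficient follows standard Mixup. All parameter values are illustrative.

```python
import numpy as np

def impute_linear(x, mask):
    """Stand-in imputer: linear interpolation over masked timesteps.
    (IMA uses a learned, self-supervised imputer instead.)"""
    t = np.arange(len(x))
    out = x.copy()
    out[mask] = np.interp(t[mask], t[~mask], x[~mask])
    return out

def ima_augment(x, mask_ratio=0.2, alpha=0.5, rng=None):
    """Sketch of Imputation-based Mixup Augmentation for a 1-D series:
    mask a fraction of timesteps, impute them, then mix the imputed
    series with the original via a Beta(alpha, alpha) coefficient."""
    rng = np.random.default_rng(rng)
    mask = rng.random(len(x)) < mask_ratio
    mask[[0, -1]] = False          # keep endpoints so interpolation is defined
    x_imp = impute_linear(x, mask)
    lam = rng.beta(alpha, alpha)   # standard Mixup coefficient
    return lam * x + (1 - lam) * x_imp
```

Because the imputed series agrees with the original at unmasked timesteps, the augmented sample stays temporally consistent and deviates only where values were masked and reconstructed.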
Noise Injection: Improving Out-of-Distribution Generalization for Limited Size Datasets
Deep learning (DL) models for image recognition have been shown to fail to generalize to data from different devices, populations, etc. COVID-19 detection from chest X-rays (CXRs), in particular, has been shown to fail to generalize to out-of-distribution (OOD) data from new clinical sources not covered in the training set. This occurs because models learn to exploit shortcuts - source-specific artifacts that do not translate to new distributions - rather than reasonable biomarkers to maximize performance on in-distribution (ID) data. To render the models more robust to distribution shifts, our study investigates the use of fundamental noise injection techniques (Gaussian, speckle, Poisson, and salt-and-pepper) during training. Our empirical results demonstrate that this technique can significantly reduce the performance gap between ID and OOD evaluation, from 0.10-0.20 to 0.01-0.06, based on results averaged over ten random seeds across key metrics such as AUC, F1, accuracy, recall, and specificity.
Kill two birds with one stone: generalized and robust AI-generated text detection via dynamic perturbations
Zhou, Yinghan, Wen, Juan, Peng, Wanli, Xue, Yiming, Zhang, Ziwei, Wu, Zhengxian
The growing popularity of large language models has raised concerns about the potential misuse of AI-generated text (AIGT). It is increasingly critical to establish an AIGT detection method with both high generalization and robustness. However, existing methods focus either on generalization or on robustness; a unified mechanism that addresses both challenges simultaneously remains underexplored. In this paper, we argue that robustness can be viewed as a specific form of domain shift, and we empirically reveal an intrinsic mechanism for model generalization in the AIGT detection task. We then propose a novel AIGT detection method (DP-Net) based on dynamic perturbations introduced by reinforcement learning with carefully designed rewards and actions. Extensive experimental results show that the proposed DP-Net significantly outperforms state-of-the-art AIGT detection methods in generalization capacity across three cross-domain scenarios. Meanwhile, DP-Net achieves the best robustness under two text adversarial attacks. The code is publicly available at https://github.com/CAU-ISS-Lab/AIGT-Detection-Evade-Detection/tree/main/DP-Net.
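The idea of a learned policy choosing which perturbation to apply can be sketched with a toy epsilon-greedy bandit over a few character-level actions. This is a deliberately simplified stand-in: DP-Net's real action space, reward design, and policy are learned and far richer; the actions and names below are hypothetical.

```python
import numpy as np

ACTIONS = ["swap_adjacent", "drop_char", "duplicate_char"]

def perturb(text, action, rng):
    """Apply one simple character-level perturbation (illustrative only)."""
    if len(text) < 2:
        return text
    i = int(rng.integers(0, len(text) - 1))
    if action == "swap_adjacent":
        return text[:i] + text[i + 1] + text[i] + text[i + 2:]
    if action == "drop_char":
        return text[:i] + text[i + 1:]
    return text[:i] + text[i] + text[i:]   # duplicate_char

class BanditPerturber:
    """Epsilon-greedy chooser over perturbation actions, a minimal
    stand-in for a reinforcement-learning perturbation policy."""
    def __init__(self, eps=0.1, rng=None):
        self.rng = np.random.default_rng(rng)
        self.eps = eps
        self.value = np.zeros(len(ACTIONS))   # running reward estimates
        self.count = np.zeros(len(ACTIONS))

    def act(self, text):
        # explore with probability eps, otherwise exploit best action
        if self.rng.random() < self.eps:
            a = int(self.rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(self.value))
        return a, perturb(text, ACTIONS[a], self.rng)

    def update(self, a, reward):
        # incremental mean update of the chosen action's value
        self.count[a] += 1
        self.value[a] += (reward - self.value[a]) / self.count[a]
```

In a full system, the reward would come from how much the perturbation improves the detector's cross-domain or adversarial performance, closing the loop the abstract describes.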
Divisive Decisions: Improving Salience-Based Training for Generalization in Binary Classification Tasks
Piland, Jacob, Sweet, Chris, Czajka, Adam
Existing saliency-guided training approaches improve model generalization by incorporating a loss term that compares the model's class activation map (CAM) for a sample's true class (i.e., the correct-label class) against a human reference saliency map. However, prior work has ignored the false-class CAM(s), that is, the model's saliency obtained for the incorrect-label class. We hypothesize that in binary tasks the true and false CAMs should diverge on the important classification features identified by humans (and reflected in human saliency maps). We use this hypothesis to motivate three new saliency-guided training methods that incorporate both the true- and false-class CAMs into the training strategy, as well as a novel post-hoc tool for identifying important features. We evaluate all introduced methods on several diverse binary closed-set and open-set classification tasks, including synthetic face detection, biometric presentation attack detection, and classification of anomalies in chest X-ray scans, and find that the proposed methods improve the generalization capabilities of deep learning models over traditional (true-class CAM only) saliency-guided training approaches. We offer source code and model weights (GitHub repository link removed to preserve anonymity) to support reproducible research.
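The divergence hypothesis can be expressed as a loss with two terms: pull the true-class CAM toward the human map, and push the true and false CAMs apart on the pixels humans mark as salient. The specific distances and weighting below are illustrative assumptions, not the paper's three proposed losses.

```python
import numpy as np

def saliency_guided_loss(true_cam, false_cam, human_map, beta=1.0):
    """Sketch of a divergence-aware saliency loss for a binary task."""
    def norm(m):
        m = m - m.min()
        return m / (m.max() - m.min() + 1e-8)   # rescale map to [0, 1]
    t, f, h = norm(true_cam), norm(false_cam), norm(human_map)
    # alignment: the true-class CAM should match the human saliency map
    align = np.mean((t - h) ** 2)
    # divergence: true and false CAMs should differ on human-salient pixels
    diverge = np.mean(h * (1.0 - np.abs(t - f)))
    return align + beta * diverge
```

A true-class CAM that matches the human map while the false-class CAM looks elsewhere yields a low loss; two CAMs that agree everywhere are penalized on exactly the features humans consider decisive.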
Enhancing Generalization in PPG-Based Emotion Measurement with a CNN-TCN-LSTM Model
Alghoul, Karim, Al Osman, Hussein, El Saddik, Abdulmotaleb
Human-computer interaction has become integral to modern life, driven by advancements in machine learning technologies. Affective computing, in particular, has focused on systems that recognize, interpret, and respond to human emotions, often using wearable devices, which provide continuous streams of physiological signals. Among various physiological signals, the photoplethysmogram (PPG) has gained prominence due to its ease of acquisition from widely available devices. However, the generalization of PPG-based emotion recognition models across individuals remains an unresolved challenge. This paper introduces a novel hybrid architecture that combines Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and Temporal Convolutional Networks (TCNs) to address this issue. The proposed model integrates the strengths of these architectures to improve robustness and generalization. Raw PPG signals are fed into the CNN for feature extraction. These features are processed in parallel by the LSTM and the TCN. The outputs of these branches are concatenated into a final feature representation, which serves as the input for classifying valence and arousal, the primary dimensions of emotion. Experiments on the Photoplethysmogram Dataset for Emotional Analysis (PPGE) demonstrate that the proposed hybrid model achieves better generalization than standalone CNN and LSTM architectures. Our results show that the proposed solution outperforms the state-of-the-art CNN architecture, as well as a CNN-LSTM model, on emotion recognition tasks with PPG signals. Using metrics such as Area Under the Curve (AUC) and F1 score, we highlight the model's effectiveness in handling subject variability.
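The data flow described (one CNN feature extractor feeding two parallel temporal branches whose outputs are concatenated) can be sketched at the shape level. The branch internals below are toy numpy stand-ins for the real convolutional, TCN, and LSTM layers; only the fusion pattern reflects the paper.

```python
import numpy as np

def conv_features(x, kernel):
    """CNN stand-in: valid 1-D convolution followed by ReLU."""
    return np.maximum(np.convolve(x, kernel, mode="valid"), 0.0)

def tcn_branch(feat, dilation=2):
    """TCN stand-in: dilated causal differences summarized by mean/max."""
    d = feat[dilation:] - feat[:-dilation]
    return np.array([d.mean(), d.max()])

def lstm_branch(feat, decay=0.9):
    """LSTM stand-in: exponentially decayed running summary of the sequence."""
    h = 0.0
    for v in feat:
        h = decay * h + (1 - decay) * v
    return np.array([h, feat.max()])

def hybrid_features(ppg, kernel=np.array([0.25, 0.5, 0.25])):
    """Fusion pattern: CNN features -> parallel TCN and LSTM branches
    -> concatenated representation for the valence/arousal classifier."""
    feat = conv_features(ppg, kernel)
    return np.concatenate([tcn_branch(feat), lstm_branch(feat)])
```

The concatenated vector is what a downstream classifier head would consume; in the real model each branch emits a learned embedding rather than two summary statistics.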